Abstract/Project Summary

This  project  was  concerned  with  the  development  of  a  methodology  for  the  specification 
of  novel  biosynthetic  pathways  towards  organic  compounds.  Our  overall  objective  is  to 
expand  the  potential  for  biological  production  of  small  molecules,  especially  for 
compounds  that  have  either  unknown  or  intractable  natural  routes.  As  a  model 
compound,  we  chose  to  design  and  assemble  pathways  for  the  production  of  glucaric 
acid,  a  “top  value-added  compound  from  biomass”  that  has  a  fully  elucidated  but  very 
lengthy  biological  route.  As  an  alternative  to  the  natural  pathway,  we  designed  5 
potential  routes  to  glucaric  acid  and  chose  two  to  implement  in  a  recombinant 
Escherichia  coli  host.  During  the  granting  period,  we  successfully  assembled  one 
pathway,  the  so-called  “benchmark  pathway,”  that  resulted  in  the  production  of  glucaric 
acid  at  over  1  g/L.  Pathway  assembly  required  the  isolation  of  a  gene  encoding  uronate 
dehydrogenase from the bacterium Pseudomonas syringae. In order to improve flux
through  the  pathway,  we  initiated  a  collaboration  with  investigators  at  the  University  of 
California,  Berkeley,  to  employ  novel  enzyme  co-localization  techniques.  A  second 
collaboration  was  initiated  to  perform  computation-based  enzyme  engineering  for  the 
second  pathway.  Finally,  a  publicly-available  database  of  enzymatic  transformations 
(www.retro-biosynthesis.com)  was  created  to  aid  future  pathway  designs. 

I. Scientific and Technical Objectives

Our  overall  objective  is  to  facilitate  the  rational  design  of  biosynthetic  pathways  for  the 
production  of  unnatural  compounds.  We  are  developing  tools  and  attempting  to  elucidate 
rules  to  establish  a  framework  for  “retro-biosynthesis,”  the  practice  of  rationally 
proposing  a  synthetic  scheme  for  a  target  compound  from  one  or  more  starting  substrates, 
based only on enzyme-mediated transformations. We are especially focused on
experimental  realization  of  microbial  synthesis,  and  the  limits  thereof,  in  contrast  to 
theoretical  predictions.  A  critical  tool  is  protein  engineering  for  altered  substrate 
specificity,  either  through  rational  protein  design  or  random  mutagenesis  (directed 
evolution). 

We  have  chosen  glucaric  acid  as  a  model  compound  for  pathway  proposition  and 
assembly  in  Escherichia  coli.  This  compound  is  naturally-occurring  in  plants  and 
mammals,  but  a  microbial  pathway  has  not  been  established.  Additionally,  the  molecule 
has been declared a “top value-added compound from biomass” (Werpy, T. and G.
Petersen  (2004).  “Top  value  added  chemicals  from  biomass.  Volume  I:  Results  of 
screening  for  potential  candidates  from  sugars  and  synthesis  gas.”  National  Renewable 
Energy  Lab  (NREL)  and  Pacific  Northwest  National  Lab  (PNNL)).  The  pathway  that  has 
been  elucidated  in  mammals  is  quite  complex,  consisting  of  more  than  10  reaction  steps 
and  an  integration  with  the  pentose  phosphate  pathway.  Moreover,  the  primary  product 
of  this  natural  biosynthetic  route  is  L-ascorbic  acid  (vitamin  C).  Thus,  we  chose  to 
propose  rationally-designed  pathways  for  the  production  of  glucaric  acid  in  a  microbial 
host. 

Specific  objectives  are  to:  (1)  propose  several  pathways  for  glucaric  acid  synthesis,  (2) 
select  at  least  one  pathway  for  experimental  study,  and  (3)  establish  microbial  synthesis 
of  glucaric  acid.  A  fourth  objective  arising  from  this  work  is  the  development  of  a 
database  for  the  re-classification  of  enzyme  activities  based  only  on  substrate  and  product 
functional  groups.  The  database  is  designed  to  enable  searches  more  amenable  to 
enzyme  selection  for  biosynthetic  pathway  proposition. 

II.  Approach 

We  propose  pathways  using  an  approach  that  is  analogous  to  that  used  by  organic 
chemists  in  proposing  synthesis  schemes.  Key  to  this  approach  is  the  following: 

•  We  consider  enzymes  as  interchangeable  “parts”  that  can  be  freely  imported  into 
the  host  organism  of  choice  irrespective  of  origin.  We  recognize  that  these 
enzyme  parts  may  need  to  be  re-engineered  for  optimal  activity  in  the  target  host. 

• We believe that advances in protein engineering permit the specification of
pathways  by  considering  primarily  the  functional  specificity  of  an  enzyme  and 
assuming  that  this  enzyme  may  need  to  be  altered  to  accept  new  substrates. 

We  proposed  biosynthetic  pathways  towards  glucaric  acid  by  searching  the  available 
databases  for  both  specific  conversions  (e.g.,  the  production  of  glucaric  acid  from 
glucuronic  acid  by  Pseudomonas  syringae  uronate  dehydrogenase)  and  generalized 
enzyme functions (e.g., the conversion of an aldehyde to a carboxylic acid through EC
class  1.2.1  enzymes).  Our  proposed  pathways  are  illustrated  in  Figure  1.  We  do  not 
attempt  an  exhaustive  search  of  all  theoretical  conversions,  but  rather  focus  on  those  most 
likely  realizable  based  on  the  prevalence  of  enzymes  within  a  certain  reaction  class. 
Given  a  pathway  design,  we  attempt  its  construction  by  recruiting  enzyme  activities 
through  PCR  amplification  of  known  genes,  chemical  synthesis  of  codon-optimized  DNA 
fragments  for  expression  in  E.  coli,  and  cloning  of  un-sequenced  genes  using  genomic 
DNA  libraries  and/or  protein  purification.  As  stated  previously,  successful  construction 
of  some  pathways  will  require  enzyme  engineering  to  achieve  desired  conversion  steps. 


We  have  identified  one  pathway  that  utilizes  only  naturally-occurring  enzymes  but  from 
disparate  sources  (Figure  1,  PW1).  A  second  designed  pathway  has  been  selected  that 
requires  enzyme  engineering  for  two  of  three  reaction  steps  (Figure  1,  PW2).  We  have 
established  collaborators  in  protein  design  to  achieve  this  goal. 

III.  Accomplishments 

Our  goals  begin  with  specifying  pathways  for  production  of  the  target  compound  of 
interest.  We  first  conducted  experiments  designed  to  validate  our  choice  of  target 
compound  to  explore  principles  of  retro-biosynthesis,  then  proceeded  with  pathway 
specification.  The  methodologies  used  to  propose  5  pathways  and  select  two  for 
assembly  have  already  been  described  in  Section  II  and  will  be  elaborated  upon  here. 
Following  pathway  specification,  we  aim  to  assemble  two  different  routes  for  microbial 
production of glucaric acid. The first pathway, PW1, requires an enzyme that, prior to the
start of this project, had not been cloned. Hence, assembly of this pathway required us to
first identify the gene coding for the enzyme. The second pathway, PW2, requires
engineered  enzymes  and  thus  required  us  to  establish  research  collaborations  to 
accomplish  this  work. 

A.  Evaluation  of  glucaric  acid  toxicity  and  metabolism  in  E.  coli 

To  successfully  construct  a  heterologous  pathway  for  unnatural  product  formation,  it  is 
best  if  the  target  compound  has  limited  effect  on  host  growth.  We  evaluated  the  growth 
of  E.  coli  strain  DH10B  in  both  LB  (complex)  and  M9/glucose  minimal  media  with 
varying concentrations of glucaric acid. In all cases, the medium was adjusted to pH 7.4
following  the  addition  of  glucaric  acid.  The  results  in  LB  indicated  that  growth  rates 
were  unaffected  by  glucaric  acid,  while  maximum  cell  densities  increased  with  increasing 
glucaric  acid.  This  is  consistent  with  reports  that  E.  coli  can  metabolize  glucaric  acid, 
but also demonstrates that it provides no growth-rate advantage in rich medium. Minimal
medium  was  used  to  evaluate  the  effect  of  glucaric  acid  on  both  growth  and  glucose 
uptake rates. Glucaric acid concentrations up to 50 mM (~10 g/L) produced no effect on
growth  or  glucose  uptake  rate.  Maximum  cell  densities  also  appeared  to  be  independent 
of  glucaric  acid,  suggesting  catabolite  repression.  We  subsequently  confirmed  that 
neither  the  DH10B  nor  BL21(DE3)  strains  of  E.  coli  will  metabolize  glucaric  acid  in  the 
presence  of  glucose.  This  fact  allowed  us  to  test  the  production  of  glucaric  acid  in  these 
strains  without  the  need  to  mutate  the  host  to  prevent  glucaric  acid  consumption.  Using 
excess  glucose  to  promote  catabolite  repression  of  glucaric  acid-utilizing  enzymes 
enabled  a  rapid  determination  of  pathway  success. 
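
The 50 mM to roughly 10 g/L conversion quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the free-acid formula of glucaric acid (C6H10O8) and standard atomic masses; the function names are illustrative and not part of the original work.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(formula_counts):
    """Sum atomic masses for a composition such as {"C": 6, "H": 10, "O": 8}."""
    return sum(ATOMIC_MASS[element] * n for element, n in formula_counts.items())


def mM_to_g_per_L(conc_mM, mw):
    """Convert a millimolar concentration to g/L for a compound of molar mass mw."""
    return conc_mM / 1000.0 * mw


# Assumption: glucaric acid taken as the free acid, C6H10O8.
GLUCARIC_ACID = {"C": 6, "H": 10, "O": 8}
mw = molar_mass(GLUCARIC_ACID)          # about 210.1 g/mol
print(round(mM_to_g_per_L(50, mw), 1))  # 10.5
```

The result, about 10.5 g/L, agrees with the approximate value given in the text.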

B.  Specification  and  selection  of  biosynthetic  pathways 

Various routes to glucaric acid were proposed by considering functional group
transformations that could achieve the target product and working backwards to arrive at
a starting substrate that would be readily taken up by or produced within the cell (Figure
1). Since glucaric acid is an oxidized form of glucose, it is reasonable to expect that
glucose can serve as such a substrate. Four of the five pathways do originate
from glucose. Pathway 3 originates from sorbitol; however, sorbitol can be produced
enzymatically from glucose. Therefore, all five pathways could originate from glucose.
One pathway, PW1, was developed by restricting the database search to known
conversions only.
In  selecting  conversion  steps,  we  attempted  to  restrict  ourselves  to  enzyme 
transformations  that  were  most  common,  as  reflected  in  a  substantial  number  of  enzymes 
within a particular 3-digit EC group. We believe that common transformations are more
likely  to  be  successfully  engineered  for  new  substrates.  In  this  manner,  we  excluded 
reactions  such  as  the  direct  oxygenation  of  glucose,  in  favor  of  the  two-step  oxidation  of 
the  primary  alcohol  of  glucose  to  an  aldehyde,  followed  by  oxidation  of  the  aldehyde  to  a 
carboxylic acid (Figure 1, PW2). Note that three of the five pathways proceed through
glucuronic  acid  (in  either  of  two  equilibrium  forms)  as  the  penultimate  compound, 
suggesting  an  important  role  of  the  glucuronic  acid  to  glucaric  acid  converting  enzyme, 
uronate  dehydrogenase. 

As stated previously, Pathway 1 was assembled by restricting the database search to
known  bioconversions.  That  is,  there  is  documentation  in  the  print  literature  and  online 
databases  that  establishes  (1)  the  production  of  glucaric  acid  from  glucuronic  acid  by 
Pseudomonas  syringae  uronate  dehydrogenase;  (2)  the  conversion  of  myo-inositol  to 
glucuronic  acid  with  the  mammalian  myo-inositol  oxygenase  (MIOX);  and  (3)  the 
production of myo-inositol from glucose by inositol 3-phosphate synthase from S.
cerevisiae.  Because  each  of  these  reactions  was  known,  PW1  was  designated  the 
benchmark  pathway.  Our  objective  is  to  propose  an  alternative  pathway  with 
productivity  above  the  benchmark  pathway.  Pathway  2  was  chosen  as  the  test  pathway 
because  it  has  only  3  steps,  utilizes  one  enzyme  of  the  benchmark  pathway,  and  its  first 
two  steps  are,  generally  speaking,  common  enzymatic  transformations. 
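
The backwards search described in this section can be illustrated with a small sketch. It encodes only the PW1 and PW2 steps named in this report as a toy reaction table and enumerates routes by walking from the target back to the starting substrate. The table layout, the enzyme labels, and the function name `retro_pathways` are assumptions made for illustration; this is not the procedure or software actually used in the project.

```python
from collections import deque

# Toy reaction table: product -> list of (substrate, enzyme label) pairs.
# Only the PW1 and PW2 conversions described in the text are encoded.
PRODUCED_FROM = {
    "glucaric acid": [("glucuronic acid", "uronate dehydrogenase (Udh)")],
    "glucuronic acid": [
        ("myo-inositol", "myo-inositol oxygenase (MIOX)"),
        ("glucodialdose", "aldehyde oxidation (generalized step)"),
    ],
    "myo-inositol": [("glucose", "myo-inositol-1-phosphate synthase (INO1)")],
    "glucodialdose": [("glucose", "primary alcohol oxidation (generalized step)")],
}


def retro_pathways(target, start, max_steps=5):
    """Enumerate routes from start to target by searching backwards from target."""
    routes = []
    queue = deque([(target, [])])  # (current compound, steps collected so far)
    while queue:
        compound, steps = queue.popleft()
        if compound == start:
            routes.append(steps)
            continue
        if len(steps) >= max_steps:
            continue
        for substrate, enzyme in PRODUCED_FROM.get(compound, []):
            # Prepend each step so the finished route reads in the forward direction.
            queue.append((substrate, [(substrate, enzyme, compound)] + steps))
    return routes


for route in retro_pathways("glucaric acid", "glucose"):
    print(" -> ".join([route[0][0]] + [step[2] for step in route]))
```

With this table the search returns two three-step routes from glucose, one through myo-inositol (PW1) and one through glucodialdose (PW2), both passing through glucuronic acid.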

C.  Cloning  uronate  dehydrogenase  activity  from  Pseudomonas  syringae 

Three  of  the  five  proposed  pathways,  including  both  of  our  selected  pathways  for 
implementation  (PW1  and  PW2),  rely  upon  the  activity  of  an  enzyme  that  has  been 
described  in  P.  syringae  but  for  which  the  corresponding  gene  had  not  been  identified. 
This  enzyme  is  uronate  dehydrogenase,  EC  1.1.1.203,  and  it  catalyzes  the  final 
conversion  step  of  glucuronic  acid  to  glucaric  acid.  Thus,  we  first  verified  the  existence 
of  the  desired  activity  in  the  native  organism.  The  conversion  of  glucuronic  acid  to 
glucaric  acid  by  Pseudomonas  syringae  grown  on  either  of  these  compounds  but  not  on 
glucose  has  been  reported.  However,  the  published  report  demonstrated  the  conversion 
through monitoring of NADH production. To verify this activity, we grew P. syringae on
both glucose and glucaric acid, prepared cell-free lysates, and tested for glucuronic
acid-converting activity. As expected, NADH co-factor was produced in the presence of
glucuronic  acid  and  when  cells  were  grown  on  glucaric  acid,  but  was  not  produced  in  the 
absence  of  the  substrate  or  when  cells  were  grown  on  glucose.  We  have  also  developed 
methods  for  the  separation  of  glucuronic  acid  and  glucaric  acid  followed  by  HPLC 
analysis.  The  separation  properties  and  HPLC  retention  times  confirmed  that  P.  syringae 
contains  glucaric  acid-producing  activity. 

In  attempting  to  establish  a  simple  growth-coupled  screen  to  isolate  uronate 
dehydrogenase,  we  observed  a  growth  difference  in  E.  coli  strain  DH10B  on  glucaric  and 
glucuronic  acid  which  we  hoped  to  exploit  as  the  basis  for  such  a  screen.  This  method 
encountered  difficulties  when  we  observed  that  plasmid-transformed  cells  exhibited 
markedly different growth behavior compared to plasmid-free cells. As an alternative,
we  reviewed  the  metabolic  pathways  for  growth  of  E.  coli  on  both  glucuronic  and 
glucaric  acids  and  determined  that  catabolism  proceeded  through  two  unrelated 
pathways.  Thus,  by  intentionally  disrupting  the  first  step  for  glucuronic  acid 
consumption  (uxaC)  while  retaining  the  route  for  glucaric  acid  consumption,  we  could 
screen  for  uronate  dehydrogenase  activity  by  growth  of  an  E.  coli  library  transformed 
with P. syringae genomic DNA on minimal medium containing only glucuronic acid as a
carbon  source.  Resulting  cells  might  then  contain  uronate  dehydrogenase,  which  would 
produce  and  consume  glucaric  acid  for  growth  (Figure  2).  Using  a  uxaC  mutant 
harboring  a  P.  syringae  genomic  DNA  library,  we  successfully  identified  open  reading 
frame PSPTO_1053 as coding for uronate dehydrogenase activity (deposited as GenBank
Accession Number EU377538). From this sequence, homologues were identified and
tested in P. putida and Agrobacterium tumefaciens (Figure 3). A manuscript describing
this cloning of uronate dehydrogenase from these three organisms has been published
(Yoon et al., 2009).
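
The logic of the growth-coupled screen can be summarized as a simple predicate. This is a deliberately simplified sketch, assuming growth on glucuronic acid depends only on whether one of the two catabolic routes is available; the function and parameter names are hypothetical, and real growth phenotypes depend on many other factors.

```python
def grows_on_glucuronate(uxaC_intact, has_udh, glucarate_route_intact=True):
    """Predict growth on glucuronic acid as the sole carbon source.

    Growth is possible either through the native glucuronic acid route
    (first step: uxaC) or, if a uronate dehydrogenase is present, by
    converting glucuronic acid to glucaric acid and consuming it through
    the separate glucaric acid route.
    """
    native_route = uxaC_intact
    bypass_route = has_udh and glucarate_route_intact
    return native_route or bypass_route


# Screening host: uxaC disrupted, glucaric acid catabolism retained.
for has_udh in (False, True):
    print(has_udh, grows_on_glucuronate(uxaC_intact=False, has_udh=has_udh))
```

In the uxaC-disrupted host, only library members that supply uronate dehydrogenase activity are predicted to grow, which is what makes growth a usable selection.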

Figure 2. Catabolism of glucuronic and glucaric acids in E. coli

Figure 3. Uronate dehydrogenase activities of clones harboring the udh gene of A. tumefaciens str. C58
(pTATudh2), P. putida KT2440 (pTPPudh), and P. syringae pv. tomato str. DC3000 (pTPSudh). Open and
solid bars represent cultures grown without or with induction by 0.1 mM IPTG, respectively.


While  cloning  the  udh  gene  as  described  above,  we  also  pursued  secondary, 
bioinformatics-based  approaches  towards  cloning  this  gene.  Based  solely  on  the 
chemistry,  enzymes  of  the  class  1.2.1.x  should  be  capable  of  converting  glucuronic  acid 
to  glucaric  acid.  The  genome  of  P.  syringae  is  sequenced,  and  there  are  8  ORFs  with 
putative  1.2.1  activity.  We  cloned  each  of  these  8  genes  and  tested  them  for  uronate 
dehydrogenase  activity;  however,  all  8  failed  to  display  the  desired  activity.  The  udh  that 
we  did  identify  was  not  consistent  with  these  open  reading  frames.  More  recently,  a 
“uronate  dehydrogenase”  sequence  from  grape  was  deposited  in  GenBank  (Accession 
DQ843600).  We  obtained  a  synthesized  version  of  this  gene  and  it  also  did  not  display 
the  desired  activity.  The  latter  gene  displayed  very  little  (10%)  identity  with  our  uronate 
dehydrogenases.  Although  difficulties  in  expression  of  a  grape  gene  in  E.  coli  cannot  be 
ruled  out,  the  low  homology  leads  us  to  believe  that  this  gene  is  either  not  in  fact  a 
uronate  dehydrogenase,  or  it  is  a  highly  divergent  version.  This  experience  emphasizes 
the need for additional experimental results to guide bioinformatic efforts to assign
enzyme  function  from  sequence  data  alone. 

D.  Assembly  of  the  benchmark  pathway 

The benchmark pathway consists of myo-inositol-1-phosphate synthase, encoded by the
INO1 gene, from yeast (Saccharomyces cerevisiae); myo-inositol oxygenase, MIOX,
from mouse; and uronate dehydrogenase, Udh, from the bacterium Pseudomonas
syringae, to produce glucaric acid from glucose. The INO1 gene was amplified by PCR
from the yeast genome. The MIOX gene was obtained as synthetic DNA with codon
usage optimized for expression in E. coli. The mouse variant of this enzyme was chosen
because of previous reports of its functional expression in E. coli (Arner et al. 2004.
Biochem. Biophys. Res. Commun. 324:1386-1392). Similarly, INO1 expression had been
shown to lead to high levels of myo-inositol accumulation in recombinant E. coli cultures
(Niu et al. 2003. J.A.C.S. 125:12998-12999). To establish the benchmark pathway, we
first set out to produce glucuronic acid through the expression of recombinant INO1 and
MIOX genes.

Expression  studies  revealed  that  both  enzymes  needed  high  gene  dosage  levels  to  result 
in accumulation of significant amounts of their respective products (myo-inositol and
glucuronic acid) in the culture medium. We subsequently co-expressed the two genes,
each  under  the  control  of  a  separate  T7  promoter,  and  achieved  production  of  glucuronic 
acid  from  glucose  at  ~0.3  g/L  (Figure  4).  The  resulting  profiles  indicated  that  MIOX 
activity  was  rate-limiting,  as  evidenced  by  an  accumulation  of  myo-inositol  and  low 
activity  of  the  MIOX  enzyme  (data  not  shown). 


Figure  4.  Production  of  glucuronic  acid  from  glucose  in  E.  coli.  Cultures  were  grown  in  triplicate  at  30°C 
in  LB  medium  supplemented  with  10  g/L  glucose  and  0.1  mM  IPTG.  Data  points  are  the  average  and 
standard deviation of the three biological replicates. ▲ = Glucuronic acid; ■ = myo-inositol; ♦ = Glucose.

The  cloned  udh  from  P.  syringae  was  next  co-expressed  with  the  first  two  enzymes  of  the 
pathway,  leading  to  production  of  glucaric  acid  from  glucose.  Although  the  highest 
recombinant  Udh  activities  were  observed  with  the  cloned  gene  from  A.  tumefaciens,  the 
activity of Udh from P. syringae was two orders of magnitude higher than INO1 and
three  orders  of  magnitude  higher  than  MIOX  (data  not  shown).  Thus,  it  was  sufficient  to 
observe  glucaric  acid  production  (Table  1).  Interestingly,  significantly  more  glucaric 
acid  was  produced  in  the  three  enzyme  system  than  glucuronic  acid  produced  from  the 
two  enzyme  system.  We  believe  that  the  high  activity  of  Udh  effectively  pulls  flux 
through  the  system,  resulting  in  much  higher  glucaric  acid  concentrations.  Manipulation 
of  inducer  concentrations  produced  a  titer  of  ~1  g/L.  This  is  the  first  demonstration  of 
microbial production of glucaric acid, a “top value-added” product that can be
produced from biomass. A manuscript describing the establishment of this pathway has
been published (Moon et al., 2009a).


Table  1.  Production  of  glucaric  acid  from  glucose  after  3  days  culture.  Cultures  were  grown  at  30°C  in 
LB medium supplemented with 10 g/L glucose and induced with IPTG. OD600 = optical density at 600 nm.
Yield  (%)  =  100  x  glucaric  acid  produced  /  glucose  consumed  (mol/mol).  Condition  A  =  0.1  mM  IPTG  at 
0  hr;  Condition  B  =  0.05  mM  IPTG  at  0  hr;  Condition  C  =  0.05  mM  IPTG  at  0  hr  and  0.1  mM  IPTG  at  17.5 
hr.  N/D  =  not  detectable. 


Condition   OD600   Glucose   myo-Inositol   Glucuronic Acid   Glucaric Acid   Yield
                    (g/L)     (g/L)          (g/L)             (g/L)           (%)
A           5.0     6.5       0.09           N/D               0.82            20.0
B           6.3     1.8       0.13           N/D               1.13            11.9
C           5.6     3.6       0.17           N/D               0.88            11.8
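
The yield column of Table 1 can be reproduced from the other columns using the definition in the caption (moles of glucaric acid produced per mole of glucose consumed). The sketch below assumes that the glucose column reports residual glucose and that the initial concentration was the 10 g/L stated in the caption; the molar masses are standard values for glucose (C6H12O6) and glucaric acid (C6H10O8), and the function name is illustrative.

```python
MW_GLUCOSE = 180.16        # g/mol, C6H12O6
MW_GLUCARIC_ACID = 210.14  # g/mol, C6H10O8
GLUCOSE_FED = 10.0         # g/L, initial glucose per the Table 1 caption (assumed)

# Condition -> (residual glucose in g/L, glucaric acid in g/L), from Table 1.
TABLE_1 = {"A": (6.5, 0.82), "B": (1.8, 1.13), "C": (3.6, 0.88)}


def molar_yield_percent(glucose_left, glucaric_acid):
    """Yield (%) = 100 x mol glucaric acid produced / mol glucose consumed."""
    consumed = (GLUCOSE_FED - glucose_left) / MW_GLUCOSE
    produced = glucaric_acid / MW_GLUCARIC_ACID
    return 100.0 * produced / consumed


for condition, (glucose_left, glucaric_acid) in TABLE_1.items():
    print(condition, round(molar_yield_percent(glucose_left, glucaric_acid), 1))
```

Under these assumptions the recomputed yields agree with the tabulated values to within about 0.1 percentage point, the small differences being consistent with rounding of the reported concentrations.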

E.  Improvement  of  productivity  through  co-localization  of  enzymes 

The  accumulation  of  myo-inositol  in  the  culture  medium,  combined  with  undetectable 
levels  of  glucuronic  acid  indicated  that  MIOX  activity  was  rate-limiting  in  the  system. 
Additional  studies  also  showed  that  MIOX  activity  was  strongly  dependent  on  the 
presence of myo-inositol in the system, an observation that had been previously reported
(Arner  et  al.,  2004).  In  an  attempt  to  increase  the  titer,  we  formed  a  collaboration  with 
Dr.  John  Dueber  at  the  University  of  California,  Berkeley,  to  utilize  synthetic  scaffolds  to 
co-localize the INO1 and MIOX proteins. The INO1 and MIOX proteins were tagged
with ligands, and a special scaffold was produced with the ligand-binding peptides in
different stoichiometries. This resulted in INO1 and MIOX molecules being co-localized
within the cytoplasm of the cell. Our hypothesis was that co-localization might increase
the local concentration of myo-inositol (by preventing dilution by diffusion), thereby
increasing the MIOX activity. Utilizing the scaffolds resulted in improvements in
glucaric  acid  titers  up  to  nearly  2  g/L.  The  use  of  synthetic  biology  parts  facilitates  the 
improvement  of  the  system,  towards  titers  necessary  to  validate  biological  synthesis  as  a 
suitable  route  for  glucaric  acid  production.  This  work  has  since  been  published  in 
Nature  Biotechnology  (Dueber  et  al.,  2009). 

E.  Collaboration  for  protein  design 

In  addition  to  the  benchmark  pathway  utilizing  naturally-occurring  enzymes,  we  have 
proposed a designed pathway based only on generalized enzyme transformations (Figure
1, PW2). This alternative pathway requires two engineered enzymes, and we established
a  collaboration  with  Codon  Devices  (Cambridge,  MA)  to  re-design  the  enzyme 
catalyzing  the  first  transformation  step.  This  reaction  is  the  oxidation  of  glucose  to  form 
glucodialdose.  A  galactose  oxidase  is  known  to  catalyze  the  desired  reaction  using 
galactose  as  a  substrate.  Hence,  the  objective  of  the  protein  engineering  project  is  to  use 
computational  methods  to  re-design  the  enzyme  for  activity  on  glucose,  to  achieve  higher 
activities than what has previously been reported. Towards the end of the reporting
period, we received from Codon Devices the first library, a control library, to screen for a
new glucose oxidase activity.

F.  Modeling  designed  pathways 

We  have  also  collaborated  with  Prof.  Alfonso  Jaramillo  at  Ecole  Polytechnique  (France) 
on  the  development  of  a  mathematical  model  to  predict  the  metabolic  burden  imposed  by 
the  expression  of  heterologous  pathways.  The  model  is  designed  to  both  estimate  the 
demand  required  to  transcribe  and  translate  plasmid-encoded  genes,  and  to  consider  the 
impact of a heterologous pathway on growth rate through the use of a stoichiometric
flux-balance model. These types of models will be useful as a metric to choose from among
many  different  options  that  will  necessarily  result  from  designed  biosynthetic  pathways. 
This work has been published in Bioinformatics (Rodrigo et al., 2008).

G.  Development  of  novel  experimental  devices  for  controlling  glucose  flux 

We  have  designed  a  strategy  to  control  flux  of  glucose  between  endogenous  metabolism 
and  synthetic  pathways  based  on  altering  transport  (Figure  5).  Glucose  normally  enters 
the cell through the phosphotransferase system (PTS), and is converted to
glucose-6-phosphate before entry into the cytoplasm. It can then enter glycolysis or the pentose
phosphate  pathway.  An  alternative  transport  system  can  be  developed  by  eliminating  the 
PTS  and  instead  using  the  galactose  permease  transporter.  Glucose  that  enters  through 
the  permease  is  phosphorylated  by  glucokinase  after  entry  into  the  cytoplasm.  We  have 
therefore  designed  a  “metabolite  valve”  to  modulate  glucokinase  activity  in  order  to 
control  the  distribution  of  glucose  that  enters  endogenous  metabolism  versus  our  test 
pathway  for  glucaric  acid  synthesis. 


[Figure 5 diagram: two panels, “Closed Valve” and “Open Valve”; see caption below.]

Figure  5.  Design  of  a  Glucose  “Metabolite  Valve.”  Altering  the  glucose  transport  system  allows  the  molecule  to  enter 
in  a  non-phosphorylated  form.  Controlling  phosphorylation  should  control  the  entry  of  glucose  into  endogenous 
metabolism. PTS = PEP-dependent phosphotransferase, GalP = galactose permease, Glf = glucose facilitator protein,
Glk = glucokinase. Glu = glucose (ext = extracellular, int = intracellular), Glu-6-P = glucose-6-phosphate. as-glk =
antisense RNA directed towards glucokinase mRNA.

We constructed several strains with mutations in components of the PTS as well as with
overexpression of the galactose permease (galP) and glucokinase (glk) to achieve
maximal  growth  in  the  absence  of  an  engineered  pathway.  Antisense  transcripts  intended 
to  modulate  Glk  activity  were  also  designed.  While  this  project  was  conceived  during  the 
period  and  with  the  help  of  ONR  funding,  follow-up  studies  and  demonstration  of  the 
feasibility  of  this  approach  were  completed  through  leveraged  funding  (National  Science 
Foundation). 


H.  Development  of  a  database  tool  for  retro-biosynthetic  pathway  design. 

In  the  process  of  proposing  pathways  for  glucaric  acid  and  other  compounds  in  our  lab, 
we  found  the  current  enzyme  and  metabolism  databases  to  be  sub-optimal  for  the 
selection  of  enzymes  according  to  generalized  enzyme  functions.  For  example,  we  were 
unable  to  search  for  “aldehydes”  as  a  functional  group  and  obtain  a  list  of  the  generic 
substrates  from  which  an  aldehyde  could  be  produced,  connected  to  the  appropriate 
enzyme activity. To address this problem, we developed a database called ReBiT
(Retro-Biosynthesis Tool) that catalogs enzyme activities as defined by the first 3 digits of the
EC  number  together  with  only  the  functional  groups  that  are  converted/produced  in  the 
substrate/product  pair.  Using  the  database,  one  can,  for  example,  identify  all  enzymatic 
reactions  that  lead  to  the  formation  of  a  primary  amine  as  a  functional  group.  ReBiT 
contains  605  structures  involved  in  637  enzymatic  reactions.  The  database  can  be 
searched  by  drawing  structures,  entering  SMILES  notation,  or  browsing  a  list  of 
functional  group  names.  One  can  also  browse  images  of  all  functional  groups.  The 
database has been made publicly available through a web interface,
http://www.retro-biosynthesis.com. ReBiT was also described in a manuscript (Martin and Prather, 2008).
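
The kind of functional-group lookup described above can be sketched in a few lines. The record layout, the three example entries, and the function names below are illustrative assumptions only; they do not reflect the actual ReBiT schema or its contents. The EC classes used are general ones: 1.1.1 and 1.1.3 oxidize alcohols, and 1.2.1 oxidizes aldehydes to carboxylic acids.

```python
from collections import defaultdict

# Hypothetical records: (3-digit EC class, substrate group, product group).
REACTIONS = [
    ("1.1.1", "alcohol", "aldehyde"),
    ("1.1.3", "alcohol", "aldehyde"),
    ("1.2.1", "aldehyde", "carboxylic acid"),
]


def index_by_product(reactions):
    """Group reaction records by the functional group they produce."""
    index = defaultdict(list)
    for ec_class, substrate_group, product_group in reactions:
        index[product_group].append((ec_class, substrate_group))
    return index


BY_PRODUCT = index_by_product(REACTIONS)


def routes_to(group):
    """Return (EC class, substrate group) pairs that yield the given group."""
    return BY_PRODUCT.get(group, [])


print(routes_to("aldehyde"))  # [('1.1.1', 'alcohol'), ('1.1.3', 'alcohol')]
```

Indexing by product group rather than by enzyme name is what allows a query such as "what can an aldehyde be made from" to be answered directly, which is the search the existing databases did not support.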


IV.  Conclusions 

Our  goal  in  establishing  this  project  was  to  develop  methodologies  and  tools  to  aid  in  the 
design  and  assembly  of  novel  biosynthetic  pathways.  To  both  investigate  the  challenges 
inherent  in  this  approach  and  demonstrate  the  feasibility  of  novel  pathway  design,  we 
chose to design and assemble routes towards glucaric acid, a compound of increasing
commercial  interest.  Through  this  work,  we  successfully  achieved  microbial  synthesis  of 
this  compound  for  the  first  time.  We  also  encountered  difficulties,  including  the  lack  of 
enzyme  databases  to  facilitate  chemistry-focused  searches,  a  lack  of  appropriate 
bioinformatic  tools  to  identify  needed  enzymes,  and  poor  activity  of  mammalian  enzymes 
in  a  bacterial  host.  In  confronting  these  difficulties,  however,  we  also  developed  new 
tools,  both  computational/database  and  experimental  in  nature.  These  include 
mathematical  models  to  help  prioritize  designed  pathways  and  novel  protein  scaffolds  to 
boost  activity  of  key  enzymes  (both  accomplished  through  collaborations).  In  summary, 
we  achieved  our  initial  goal  of  microbial  synthesis  of  glucaric  acid  and  demonstrated  the 
promise  of  novel  pathway  design  for  biological  production  of  organic  compounds. 

V. Significance

The  significance  of  this  work  can  be  assessed  from  two  perspectives.  First,  our  ability  to 
de  novo  design  and  assemble  pathways  towards  synthesis  of  an  organic  compound  serves 
as  a  proof-of-concept  of  the  ability  to  branch  out  from  known  biological  routes  towards 
the  synthesis  of  desired  targets  in  the  absence  of  a  pre-defined  route.  As  we  seek  to 
uncover  sustainable  sources  of  fuels  and  chemicals,  the  tools  we  have  developed  as  well 
as  our  experiences  in  establishing  the  benchmark  pathway  should  facilitate  future 
pathway  design/assembly  efforts.  Second,  we  chose  as  our  target  compound  glucaric 
acid,  a  molecule  that  has  been  declared  as  a  “top  value-added  chemical”  and  which  is 
often  characterized  as  a  molecule  with  much  potential  save  for  the  current  expense  of 
manufacturing.  Thus,  our  work  could  potentially  lead  to  an  alternative,  sustainable,  and 
more  economical  means  of  producing  this  high-value  compound. 

first described in the ONR proposal have subsequently been integrated into a
successfully funded proposal for an NSF-sponsored Engineering Research Center in
Synthetic Biology (SynBERC). Since the end of ONR support on 31 May 2008,
this project has been supported by SynBERC.

- Future plans for technology transfer - Glucaric acid has been identified as a
"top value-added" product from biomass. We anticipate that continued
improvements  in  production  of  this  compound  through  biological  means  will 
generate  interest  from  biomass  companies. 